Our research program supported by this grant spanned several areas of
mathematics and data science. It resulted in significant discoveries in
high-dimensional inference and high-dimensional probability and led to a
variety of applications in statistics, biomedical data analysis,
quantization, dimension reduction, and network science.

1. High-dimensional inference and geometry

Our main and surprising discovery was that many classical methods that
were designed for structured linear regression provably work even for
non-linear data PESIEI]. The non-linearity can be very general:
discontinuous, not one-to-one, and even unknown. In spite of this, we showed
that methods for linear regression, such as Lasso, remain effective even in
the presence of such non-linearities. This dramatically extends the range of
statistical models for which data analysis can be done rigorously. Our
findings have a variety of applications in quantization and compressed
sensing [20], as well as in the analysis of biomedical data mm-

As an example, our results apply to binary 0/1 measurements, which arise
e.g. in classification problems and quantization. For such measurements, we
also expanded the range of probability distributions to which the non-linear
recovery results apply. Our original analysis for non-linear data nil El]
was done under the premise of Gaussian measurements. In the new work [T],
we extended it to general non-linear sub-gaussian measurements.
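The claim that linear methods survive a binary non-linearity can be illustrated with a minimal sketch. It assumes Gaussian measurements and sign (1-bit) observations, and it swaps in a simple linear estimator followed by hard thresholding rather than the Lasso itself. The function names, dimensions and seed are illustrative choices, not taken from the cited work.

```python
import numpy as np

def recover_direction_from_signs(A, y, sparsity):
    """Estimate the direction of a sparse signal from y = sign(A x)."""
    m = A.shape[0]
    # Linear estimator: for Gaussian A, the mean of z is proportional to x.
    z = A.T @ y / m
    # Impose the structure (sparsity) by keeping the largest entries only.
    keep = np.argsort(np.abs(z))[-sparsity:]
    x_hat = np.zeros_like(z)
    x_hat[keep] = z[keep]
    return x_hat / np.linalg.norm(x_hat)

def demo_one_bit_recovery(n=200, m=2000, sparsity=5, seed=0):
    """Return the cosine between the true and the estimated direction."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    support = rng.choice(n, size=sparsity, replace=False)
    x[support] = rng.choice([-1.0, 1.0], size=sparsity) / np.sqrt(sparsity)
    A = rng.standard_normal((m, n))   # Gaussian measurement matrix
    y = np.sign(A @ x)                # only the signs are observed
    x_hat = recover_direction_from_signs(A, y, sparsity)
    return float(x @ x_hat)

cosine = demo_one_bit_recovery()
```

Only the direction of x can be recovered here, since the sign non-linearity destroys all information about the norm; this is why the sketch compares unit vectors.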

In the area of discrete and computational geometry, we analyzed how many
random hyperplanes are needed to cut a given set K into much smaller pieces
m- It turned out that the optimal number of cutting hyperplanes is
proportional to a single geometric parameter of the set K, namely the
effective dimension d(K). This complexity parameter is also known to govern
the efficacy of algorithms in high-dimensional inference and compressed
sensing. In particular, the optimal number of measurements in our work on
non-linear data turned out to be proportional to the same parameter, the
effective dimension of the feasible set K; see [21]. Through this link, our
work on cutting hyperplanes has implications in quantization, coding,
dimension reduction, and compressed sensing.
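The link between cutting hyperplanes and quantization rests on a classical fact: a random hyperplane through the origin separates two unit vectors with probability equal to their angle divided by pi. The sketch below checks this numerically for a single pair of points. It is an illustration under these assumptions (hyperplanes through the origin with Gaussian normals, hypothetical function name), not the analysis of the cited work.

```python
import numpy as np

def separating_fraction(x, y, num_hyperplanes, rng):
    """Fraction of random hyperplanes through the origin separating x and y."""
    # Each row of G is the normal vector of one random hyperplane.
    G = rng.standard_normal((num_hyperplanes, x.shape[0]))
    # A hyperplane separates x and y when they fall on opposite sides.
    return float(np.mean(np.sign(G @ x) != np.sign(G @ y)))

rng = np.random.default_rng(1)
n = 50
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
y = rng.standard_normal(n)
y /= np.linalg.norm(y)

angle = float(np.arccos(np.clip(x @ y, -1.0, 1.0)))
fraction = separating_fraction(x, y, 5000, rng)
# fraction should be close to angle / pi, so the sign patterns of G @ x
# act as a binary code whose Hamming distance tracks the angular distance.
```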

Another natural measure of complexity of a convex set K is the number of
faces of a polytope P that approximates K to within a constant precision.
To encode a high-dimensional convex body in a form allowing computer
processing, one has to construct an oracle, i.e., an algorithm that takes
the coordinates of a point as input and outputs whether or not the point is
contained in the body. Construction of an oracle for a general convex body
is known to be computationally hard. A possible way to bypass this obstacle
was suggested by Barvinok. He proposed to approximate a given body by a
projection of a section of a simplex in a higher dimension. This new body,
being the feasible set of a linear programming problem, allows an efficient
construction of the oracle. The complexity of this construction is
determined by the dimension of the simplex. An approximation by a simplex
whose dimension is polynomial in the dimension of the original body would
have made it possible to circumvent the computational hardness of the
oracle construction. Previous work of the co-PI in collaboration with
A. Litvak and N. Tomczak-Jaegermann showed that, in general, such an
approximation is impossible. This, however, left open the possibility of
constructing such an approximation for some important classes of convex
bodies, primarily for convex bodies with coordinate symmetries.
Nevertheless, we showed in m that even such highly symmetric convex bodies
require a simplex of exponential dimension to produce such an
approximation, which makes it impossible to bypass the hardness obstacle in
this way.


2. Networks

In the area of network analysis, we developed and rigorously analyzed
algorithmic methods for finding structure in sparse networks [6, 7, 5].
There had been an abundance of algorithmic methods for data mining in
relatively dense networks, where an average vertex has degree > log n, i.e.
is connected to at least about log n other vertices. Most of these methods,
including the most popular one, Principal Component Analysis (PCA),
manifestly fail for sparser networks, in particular for those with constant
average degrees.

Practitioners had suggested that the problem for sparse networks lies in
the vertices of abnormally high degrees, and that regularizing those
vertices by pruning or lowering their weight could solve the problem. We
confirmed this rigorously by showing a very general result: any
regularization pre-processing which brings the degrees down to normal
provably leads to spectral concentration, and therefore makes spectral
methods like PCA work [7].
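One simple instance of such a regularization can be sketched as follows, assuming an Erdos-Renyi graph with constant average degree and the crudest pruning rule: delete every edge incident to a vertex whose degree exceeds a threshold. The threshold of twice the average degree and the function names are illustrative assumptions; the result described above covers far more general regularizations.

```python
import numpy as np

def sparse_random_graph(n, avg_degree, rng):
    """Symmetric 0/1 adjacency matrix of G(n, avg_degree / n), no loops."""
    upper = np.triu(rng.random((n, n)) < avg_degree / n, k=1)
    return (upper | upper.T).astype(float)

def regularize_degrees(adj, max_degree):
    """Remove all edges incident to vertices of degree above max_degree."""
    adj = adj.copy()
    heavy = adj.sum(axis=1) > max_degree
    adj[heavy, :] = 0.0
    adj[:, heavy] = 0.0
    return adj

rng = np.random.default_rng(2)
n, d = 1000, 3.0
A = sparse_random_graph(n, d, rng)
A_reg = regularize_degrees(A, max_degree=2 * d)

# Spectral concentration is measured by the operator norm of the deviation
# from the expected adjacency matrix (all entries d / n, ignoring the
# diagonal); the two values can be compared before and after pruning.
expected = np.full((n, n), d / n)
norm_before = np.linalg.norm(A - expected, 2)
norm_after = np.linalg.norm(A_reg - expected, 2)
```

Pruning can only lower the degrees of the remaining vertices, so after one pass every degree is at most the threshold.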

In a related development [5], we proved for the first time that methods
based on semidefinite programming also work for structure discovery in
sparse networks. Our analysis is based on Grothendieck's inequality. All
previous applications of this inequality in theoretical computer science
had yielded only approximations to within some fixed constant factor. We
demonstrated a new method where Grothendieck's inequality can be used to
give an arbitrarily fine approximation.

For both methods, our theory is applicable to a far wider class of networks
than the benchmark class of stochastic block models that is usually
discussed in network science results.

3. Permanents, hafnians and perfect matchings

In numerical linear algebra, we studied the fastest known randomized
approximation algorithm for computing the permanents of matrices with
non-negative entries, namely the Barvinok-Godsil-Gutman estimator. The
permanent is an important computational characteristic which counts, for
instance, the number of perfect matchings in a bipartite graph. Besides
this, permanents arise naturally in the study of contingency tables,
evaluation of the expected product of dependent normal random variables,
etc. It is known that the evaluation of a permanent is a #P-hard problem,
so, given the limitations of computing power, one can hope only to estimate
it. The Barvinok-Godsil-Gutman probabilistic estimator is the fastest known
means of estimating the permanent. In the worst-case scenario, it outputs
the permanent with a multiplicative error that is exponential in the size
of the matrix. Yet it has been observed that, typically, the actual
performance of this estimator is much better than the worst case. We
discovered a sufficient condition on a deterministic graph or matrix
guaranteeing a smaller error for estimating the permanent m-
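The Gaussian form of this estimator replaces each entry a_ij by sqrt(a_ij) times an independent standard normal variable and outputs the squared determinant, whose expectation equals the permanent. The sketch below compares it with a brute-force permanent on a small matrix. Averaging many independent copies is an addition made here only to make the unbiasedness visible; the estimator itself uses a single sample, and the matrix, sample size and function names are illustrative assumptions.

```python
import itertools
import numpy as np

def permanent_bruteforce(A):
    """Exact permanent by summing over all permutations (tiny matrices only)."""
    n = A.shape[0]
    return float(sum(
        np.prod([A[i, p[i]] for i in range(n)])
        for p in itertools.permutations(range(n))
    ))

def bgg_estimate(A, num_samples, rng):
    """Average of det(B)^2, where B_ij = sqrt(A_ij) * g_ij with g_ij ~ N(0,1)."""
    n = A.shape[0]
    G = rng.standard_normal((num_samples, n, n))
    B = np.sqrt(A) * G                      # broadcast over the sample axis
    return float(np.mean(np.linalg.det(B) ** 2))

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [2.0, 0.0, 1.0]])
exact = permanent_bruteforce(A)
estimate = bgg_estimate(A, 400_000, np.random.default_rng(3))
```

The determinant costs polynomial time while the brute-force sum has n! terms, which is the whole point of the estimator; the price is the random multiplicative error discussed above.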

In a related development in computational graph theory [16], we analyzed a
probabilistic algorithm for estimating the number of perfect matchings in
general graphs. Unlike bipartite graphs, where the number of perfect
matchings is given by the permanent of the adjacency matrix, in the general
case it is given by a much more complex quantity, namely the hafnian of the
same matrix. Because of this additional complexity, almost all known
methods of estimating the number of perfect matchings which were developed
for bipartite graphs fail for general ones. The only exception is the
Barvinok estimator, which is currently the unique polynomial-time
probabilistic estimator for the number of perfect matchings. This fact
makes the error analysis for this estimator especially important. As for
bipartite graphs, the worst-case error of this estimator is exponential in
the size of the graph. We showed that if the graph possesses certain
expansion properties, then the error of the Barvinok estimator is much
smaller than in the worst case.
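For general graphs the analogous construction uses a random skew-symmetric matrix: its determinant is the square of its Pfaffian, and the expected determinant equals the hafnian. The sketch below checks this on the complete graph with four vertices, which has three perfect matchings. As before, the averaging over many samples, the test graphs and the function names are assumptions made for illustration only.

```python
import numpy as np

def hafnian_bruteforce(A):
    """Hafnian of a symmetric matrix; counts perfect matchings for 0/1 input."""
    n = A.shape[0]
    if n == 0:
        return 1.0
    if n % 2 == 1:
        return 0.0
    if n == 2:
        return float(A[0, 1])
    total = 0.0
    rest = list(range(1, n))
    for j in rest:                      # match vertex 0 with vertex j
        if A[0, j] != 0:
            keep = [k for k in rest if k != j]
            total += A[0, j] * hafnian_bruteforce(A[np.ix_(keep, keep)])
    return total

def barvinok_hafnian_estimate(A, num_samples, rng):
    """Average of det(W) over random skew-symmetric W built from A."""
    n = A.shape[0]
    G = rng.standard_normal((num_samples, n, n))
    upper = np.triu(np.sqrt(A) * G, k=1)         # entries above the diagonal
    W = upper - np.transpose(upper, (0, 2, 1))   # w_ji = -w_ij
    return float(np.mean(np.linalg.det(W)))

K4 = np.ones((4, 4)) - np.eye(4)                 # complete graph on 4 vertices
exact_matchings = hafnian_bruteforce(K4)
estimated_matchings = barvinok_hafnian_estimate(
    K4, 200_000, np.random.default_rng(4)
)
```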


4. Random matrix theory

Several significant advances were made in random matrix theory and its
applications. We established delocalization of eigenvectors for a wide
class of random matrices [mils]. This means that with high probability,
every eigenvector of a random matrix is delocalized in the sense that any
subset of its coordinates carries a non-negligible portion of its norm. Our
results pertain to a wide class of random matrices, including matrices with
independent entries, symmetric and skew-symmetric matrices, as well as some
other naturally arising ensembles.
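This form of delocalization can be observed numerically. For a fixed subset size, the subset carrying the least mass of a vector consists of its smallest coordinates, so it suffices to sum the smallest squared entries of each eigenvector. The sketch below does this for a symmetric Gaussian matrix; the ensemble, the size n = 400 and the choice of half the coordinates are illustrative assumptions.

```python
import numpy as np

def smallest_mass_on_subsets(vectors, subset_size):
    """For each unit column, the least squared mass on any subset of that size."""
    squared = np.sort(vectors ** 2, axis=0)      # ascending within each column
    return squared[:subset_size, :].sum(axis=0)  # worst subset = smallest entries

rng = np.random.default_rng(5)
n = 400
G = rng.standard_normal((n, n))
H = (G + G.T) / np.sqrt(2)                       # symmetric Gaussian matrix
_, eigvecs = np.linalg.eigh(H)                   # columns are unit eigenvectors

# Least mass carried by any half of the coordinates, over all eigenvectors.
worst_mass = float(smallest_mass_on_subsets(eigvecs, n // 2).min())
# Largest single coordinate over all eigenvectors, for comparison.
largest_entry = float(np.abs(eigvecs).max())
```

A localized vector, such as a coordinate vector, would give a worst-case mass of zero, so a value bounded away from zero for every eigenvector is the behavior described above.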

Next, we analyzed in [2] the condition number of sparse random matrices, a
quantity important for controlling the running time and the precision of
various numerical linear algebra algorithms. This is an important problem,
especially for sparse random matrices, which are among the basic tools in
statistics, computer science and signal processing. We found that, as
sparsity increases, the condition number stays nearly the same as for a
dense matrix almost until the transition point where an entire zero row
appears (at which point it jumps to infinity).
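A small experiment along these lines can be set up as follows. It assumes one particular sparse model, a Gaussian matrix in which each entry is kept independently with probability p, and reports the largest and smallest singular values, whose ratio is the condition number. The model, the sizes and the function names are illustrative assumptions and no specific numerical outcome is claimed.

```python
import numpy as np

def sparse_gaussian_matrix(n, p, rng):
    """n x n Gaussian matrix whose entries are kept with probability p."""
    mask = rng.random((n, n)) < p
    return mask * rng.standard_normal((n, n))

def extreme_singular_values(M):
    """Largest and smallest singular values; their ratio is the condition number."""
    s = np.linalg.svd(M, compute_uv=False)   # sorted in descending order
    return float(s[0]), float(s[-1])

rng = np.random.default_rng(6)
n = 200
# Sparsity levels from dense (p = 1) down to about 10 nonzeros per row.
results = {
    p: extreme_singular_values(sparse_gaussian_matrix(n, p, rng))
    for p in (1.0, 0.2, 0.05)
}
# results[p] = (s_max, s_min); s_max / s_min can be compared across p.
# A matrix with an entire zero row has s_min equal to zero up to rounding,
# which is the transition point mentioned in the text.
```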

Furthermore, we showed how to improve the behavior of a random matrix by
modifying a small fraction of its entries [2]. We studied the conditions
under which the operator norm of a random matrix A can be reduced to the
optimal order by zeroing out a small submatrix of A. We found that this is
possible if and only if the entries of A have zero mean and finite
variance. Moreover, we obtained an almost optimal dependence between the
size of the removed submatrix and the resulting operator norm.

Finally, we developed a simple and general tool for bounding the deviation
of random matrices on arbitrary geometric sets [H]. This new deviation
inequality unified many existing results, such as the Johnson-Lindenstrauss
Lemma, which plays a major role in dimension reduction, the M* bound in
high-dimensional convex geometry, and a non-asymptotic version of the
Bai-Yin limiting law in random matrix theory. On top of that, our deviation
inequality led to several new applications, in particular for dimension
reduction, model selection, structured regression and compressed sensing
[8].
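The Johnson-Lindenstrauss Lemma is the simplest special case to try out: a scaled Gaussian matrix maps a finite point set to a lower dimension while nearly preserving all pairwise distances. The sketch below measures the largest relative distortion for one random point set. The dimensions, the number of points and the function names are illustrative assumptions.

```python
import numpy as np

def gaussian_projection(points, target_dim, rng):
    """Project rows of `points` with a Gaussian matrix scaled by 1/sqrt(m)."""
    n = points.shape[1]
    A = rng.standard_normal((target_dim, n)) / np.sqrt(target_dim)
    return points @ A.T

def max_distortion(points, projected):
    """Largest relative change of a pairwise distance after projection."""
    i, j = np.triu_indices(points.shape[0], k=1)   # all pairs i < j
    d_orig = np.linalg.norm(points[i] - points[j], axis=1)
    d_proj = np.linalg.norm(projected[i] - projected[j], axis=1)
    return float(np.max(np.abs(d_proj / d_orig - 1.0)))

rng = np.random.default_rng(7)
points = rng.standard_normal((30, 1000))           # 30 points in dimension 1000
projected = gaussian_projection(points, 400, rng)  # reduced to dimension 400
distortion = max_distortion(points, projected)
```

The deviation inequality described above controls the same quantity uniformly over an arbitrary set rather than a finite one, with the target dimension governed by the geometry of the set.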

In computational graph theory, we studied a randomized algorithm for
estimating the number of perfect matchings in general graphs. In random
matrix theory, we established delocalization of eigenvectors for a wide
class of random matrices, proved a sharp invertibility result for sparse
random matrices, showed how to improve the norm of a general random matrix
by removing a small submatrix, and developed a simple and general tool for
bounding the deviation of random matrices on arbitrary geometric sets. This
has applications for dimension reduction, regression and compressed
sensing.